28. Use TfIdf to Get the Most Important Word

Question:

To figure out which words are causing the problem, go back to the TfIdf and use the feature numbers you obtained in the previous part of the mini-project to look up the associated words. You can get a list of all the words in the TfIdf by calling get_feature_names() on it (in scikit-learn 1.0 and later the method is get_feature_names_out(), and get_feature_names() was removed in 1.2). Pull out the word that’s causing most of the discrimination of the decision tree. What is it? Does it make sense as a word that’s uniquely tied to either Chris Germany or Sara Shackleton, a signature of sorts?
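
The lookup described in the question can be sketched as follows. This is a minimal illustration, not the course solution: the helper name most_important_word and the toy lists are hypothetical. In find_signature.py the importances would presumably come from the fitted tree's feature_importances_ attribute and the names from the vectorizer's feature-name list; the variable names clf and vectorizer in the comments are assumptions.

```python
def most_important_word(importances, feature_names):
    """Return (index, word, importance) for the feature with the largest importance.

    Works on plain lists as well as NumPy arrays, since it only needs
    len() and integer indexing.
    """
    if len(importances) != len(feature_names):
        raise ValueError("importances and feature_names must have the same length")
    # Index of the largest importance value (the "feature number").
    top_index = max(range(len(importances)), key=lambda i: importances[i])
    return top_index, feature_names[top_index], importances[top_index]


# Toy stand-ins for the real arrays (hypothetical values, not Enron results).
toy_importances = [0.0, 0.1, 0.7, 0.2]
toy_feature_names = ["alpha", "beta", "gamma", "delta"]

index, word, importance = most_important_word(toy_importances, toy_feature_names)
print(index, word, importance)

# In the mini-project the call might look like this (assumed variable names):
#   names = vectorizer.get_feature_names_out()   # get_feature_names() on older scikit-learn
#   print(most_important_word(clf.feature_importances_, names))
```

The feature number and the position in the word list line up because both come from the same fitted vectorizer, so the index of the largest importance can be used directly to index the list of words.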

Start Quiz:

INSTRUCTOR NOTE:

Make sure you add your modifications to find_signature.py so that it obtains the most impactful word.